CCBR-638

Maggie Cam

Nov 11, 2015

Background

We want to analyze two RNA-Seq datasets derived from total RNA of protrusion (Ps) and cell body (CB) fractions of mouse fibroblast cells. We are interested in RNAs that are enriched in protrusions (i.e., have a high Ps/CB ratio) and want to identify subsets of these RNAs that are co-regulated by different factors.

The first dataset aims at identifying RNAs whose enrichment at protrusions is (or is not) affected by knockdown of the APC tumor suppressor protein. We have sequenced 4 control (si-control) and 4 APC knockdown (si-APC) replicates. Each replicate consists of paired Ps and CB fractions (i.e. 16 samples total).

The second dataset aims at identifying RNAs whose enrichment at protrusions is affected by expression of a competing UTR construct, which mislocalizes RNAs through sequestration of necessary factors. We have sequenced 4 control (HBB) and 4 experimental (Pkp4) replicates. Each replicate consists of paired Ps and CB fractions (i.e. 16 samples total).

Data Processing

## Bioconductor version 3.2 (BiocInstaller 1.20.3), ?biocLite for help

This part of the analysis was run on Biowulf, and the resulting data were transferred to a local directory.

#Command line version
module load subread
x=$(ls *.bam)
featureCounts -p -T 8 -s 2 -t exon -g gene_id -a /data/maggiec/RNASeq/Genomes/mm10/gencode.vM4.all.gtf -o counts_ss.txt $x

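#R version (this is the version that was used):
library(Rsubread)  # featureCounts
library(limma)     # readTargets
library(edgeR)     # DGEList

gtf="/data/maggiec/RNASeq/Genomes/mm10/gencode.vM4.all.gtf"
targets <- readTargets()

# Count reversely stranded, paired-end fragments per gene (summarized by gene_name)
fc <- featureCounts(files=targets$bam,isGTFAnnotationFile=TRUE,nthreads=32,
      annot.ext=gtf,GTF.attrType="gene_name",strandSpecific=2,isPairedEnd=TRUE)
x <- DGEList(counts=fc$counts, genes=fc$annotation)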

Load data from local directory:

## [1] "fc"      "gtf"     "targets" "x"
##                           bam Cell Compartment Replicate Phenotype
## 1      Sample_si.APC_CB.1.bam  APC          CB         1    APC_CB
## 2      Sample_si.APC_CB.2.bam  APC          CB         2    APC_CB
## 3      Sample_si.APC_CB.3.bam  APC          CB         3    APC_CB
## 4      Sample_si.APC_CB.4.bam  APC          CB         4    APC_CB
## 5      Sample_si.APC_Ps.1.bam  APC          Ps         1    APC_Ps
## 6      Sample_si.APC_Ps.2.bam  APC          Ps         2    APC_Ps
## 7      Sample_si.APC_Ps.3.bam  APC          Ps         3    APC_Ps
## 8      Sample_si.APC_Ps.4.bam  APC          Ps         4    APC_Ps
## 9  Sample_si.control_CB.1.bam  Con          CB         1    Con_CB
## 10 Sample_si.control_CB.2.bam  Con          CB         2    Con_CB
## 11 Sample_si.control_CB.3.bam  Con          CB         3    Con_CB
## 12 Sample_si.control_CB.4.bam  Con          CB         4    Con_CB
## 13 Sample_si.control_Ps.1.bam  Con          Ps         1    Con_Ps
## 14 Sample_si.control_Ps.2.bam  Con          Ps         2    Con_Ps
## 15 Sample_si.control_Ps.3.bam  Con          Ps         3    Con_Ps
## 16 Sample_si.control_Ps.4.bam  Con          Ps         4    Con_Ps
##                        Status Sample_si.APC_CB.1.bam
## 1                    Assigned               20305088
## 2        Unassigned_Ambiguity                 308973
## 3     Unassigned_MultiMapping                3871900
## 4       Unassigned_NoFeatures                 840823
## 5         Unassigned_Unmapped                      0
## 6   Unassigned_MappingQuality                      0
## 7  Unassigned_FragementLength                      0
## 8          Unassigned_Chimera                      0
## 9        Unassigned_Secondary                      0
## 10     Unassigned_Nonjunction                      0
## 11       Unassigned_Duplicate                      0
##    Sample_si.APC_CB.2.bam Sample_si.APC_CB.3.bam Sample_si.APC_CB.4.bam
## 1                17703207               18009731               15908957
## 2                  269120                 268304                 241514
## 3                 3428255                3457842                3135614
## 4                  739520                 834576                 686874
## 5                       0                      0                      0
## 6                       0                      0                      0
## 7                       0                      0                      0
## 8                       0                      0                      0
## 9                       0                      0                      0
## 10                      0                      0                      0
## 11                      0                      0                      0
##    Sample_si.APC_Ps.1.bam Sample_si.APC_Ps.2.bam Sample_si.APC_Ps.3.bam
## 1                16214065               14963100               19532841
## 2                  325964                 305871                 371835
## 3                 5844378                6089572                6446149
## 4                  458341                 470529                 511897
## 5                       0                      0                      0
## 6                       0                      0                      0
## 7                       0                      0                      0
## 8                       0                      0                      0
## 9                       0                      0                      0
## 10                      0                      0                      0
## 11                      0                      0                      0
##    Sample_si.APC_Ps.4.bam Sample_si.control_CB.1.bam
## 1                16877618                   16310291
## 2                  327423                     260673
## 3                 5711504                    3377245
## 4                  445550                     671136
## 5                       0                          0
## 6                       0                          0
## 7                       0                          0
## 8                       0                          0
## 9                       0                          0
## 10                      0                          0
## 11                      0                          0
##    Sample_si.control_CB.2.bam Sample_si.control_CB.3.bam
## 1                    16325320                   20138987
## 2                      264664                     301984
## 3                     3393474                    4098971
## 4                      688194                    1019531
## 5                           0                          0
## 6                           0                          0
## 7                           0                          0
## 8                           0                          0
## 9                           0                          0
## 10                          0                          0
## 11                          0                          0
##    Sample_si.control_CB.4.bam Sample_si.control_Ps.1.bam
## 1                    17292738                   15542005
## 2                      267364                     329631
## 3                     3438365                    6595530
## 4                      766768                     364223
## 5                           0                          0
## 6                           0                          0
## 7                           0                          0
## 8                           0                          0
## 9                           0                          0
## 10                          0                          0
## 11                          0                          0
##    Sample_si.control_Ps.2.bam Sample_si.control_Ps.3.bam
## 1                    13471411                   17150190
## 2                      278749                     317132
## 3                     5669362                    5787571
## 4                      365185                     402302
## 5                           0                          0
## 6                           0                          0
## 7                           0                          0
## 8                           0                          0
## 9                           0                          0
## 10                          0                          0
## 11                          0                          0
##    Sample_si.control_Ps.4.bam
## 1                    15704500
## 2                      299725
## 3                     5436986
## 4                      356842
## 5                           0
## 6                           0
## 7                           0
## 8                           0
## 9                           0
## 10                          0
## 11                          0

Mapping Rate:
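The summary table above can be turned into a per-sample rate. The sketch below assumes that "mapping rate" means the fraction of counted fragments that featureCounts assigned to a gene; the report does not define it, and the helper name `assigned_fraction` is hypothetical. Only two samples are copied in to keep the example short.

```r
# featureCounts summary for two samples, copied from the table above
# (status rows that are zero in every sample are omitted).
stat <- data.frame(
  Status = c("Assigned", "Unassigned_Ambiguity",
             "Unassigned_MultiMapping", "Unassigned_NoFeatures"),
  Sample_si.APC_CB.1.bam = c(20305088, 308973, 3871900, 840823),
  Sample_si.APC_Ps.1.bam = c(16214065, 325964, 5844378, 458341)
)

# Hypothetical helper: fraction of fragments per sample that were assigned.
# Assumes the first column holds the status labels, as in fc$stat.
assigned_fraction <- function(stat) {
  counts <- as.matrix(stat[, -1, drop = FALSE])
  rownames(counts) <- stat$Status
  counts["Assigned", ] / colSums(counts)
}

rate <- assigned_fraction(stat)
print(round(100 * rate, 1))  # percent assigned, one value per sample
```

Applied to the full table, the call would presumably be `assigned_fraction(fc$stat)`.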

QC Check: Look at raw signal distribution and median expression levels

Data are normalized by TMM. Dimensions of the count matrix after filtering (genes x samples):

## [1] 13015    16
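A minimal sketch of filtering followed by TMM normalization with edgeR is given here for orientation. The counts are simulated stand-ins for `fc$counts`, and the filter (more than 1 CPM in at least 4 samples) is an assumption: the report does not state the filter used, and since HTSFilter is attached in the session the actual filter may well differ.

```r
library(edgeR)

set.seed(1)
# Simulated counts standing in for fc$counts: 200 near-silent genes
# followed by 800 expressed genes, across 16 samples.
gene_mu <- c(rep(0.05, 200), rep(50, 800))
counts <- matrix(rnbinom(1000 * 16, mu = rep(gene_mu, times = 16), size = 2),
                 nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("S", 1:16)))
group <- factor(rep(c("APC_CB", "APC_Ps", "Con_CB", "Con_Ps"), each = 4))

x <- DGEList(counts = counts, group = group)

# Assumed filter (not taken from the report): keep genes with more than
# 1 count per million in at least 4 samples, then recompute library sizes.
keep <- rowSums(cpm(x) > 1) >= 4
x <- x[keep, , keep.lib.sizes = FALSE]

# TMM scaling factors are stored in x$samples$norm.factors.
x <- calcNormFactors(x, method = "TMM")
print(dim(x))
print(x$samples$norm.factors)
```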

Run Voom
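For readers unfamiliar with this step, a voom call typically looks like the sketch below. It uses simulated counts rather than the report's data, and the variable names (`counts`, `celltype`, `v`) are illustrative; the four-group layout mirrors the Phenotype column of the targets table.

```r
library(limma)
library(edgeR)

set.seed(2)
# Simulated counts (500 genes x 16 samples) standing in for the filtered data.
counts <- matrix(rnbinom(500 * 16, mu = 100, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), paste0("S", 1:16)))
celltype <- factor(rep(c("APC_CB", "APC_Ps", "Con_CB", "Con_Ps"), each = 4))
design <- model.matrix(~ 0 + celltype)

# TMM factors are picked up by voom when it is given a DGEList.
x <- calcNormFactors(DGEList(counts = counts), method = "TMM")

# voom returns log2-CPM values (v$E) and per-observation precision
# weights (v$weights) for use in the linear model.
v <- voom(x, design, plot = FALSE)
print(dim(v$E))
print(dim(v$weights))
```

Setting `plot = TRUE` would additionally draw the mean-variance trend.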


After Normalization



Run Similarity Heatmap

Statistical Analysis of Experimental groups (lmFit)

##    celltypeAPC_CB celltypeAPC_Ps celltypeCon_CB celltypeCon_Ps
## 1               1              0              0              0
## 2               1              0              0              0
## 3               1              0              0              0
## 4               1              0              0              0
## 5               0              1              0              0
## 6               0              1              0              0
## 7               0              1              0              0
## 8               0              1              0              0
## 9               0              0              1              0
## 10              0              0              1              0
## 11              0              0              1              0
## 12              0              0              1              0
## 13              0              0              0              1
## 14              0              0              0              1
## 15              0              0              0              1
## 16              0              0              0              1
## attr(,"assign")
## [1] 1 1 1 1
## attr(,"contrasts")
## attr(,"contrasts")$celltype
## [1] "contr.treatment"
## [1] "APC_CB" "APC_Ps" "Con_CB" "Con_Ps"
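The design matrix above has one column per group, so comparisons are expressed as contrasts between columns. The sketch below is an assumption about which contrasts are of interest, based on the Background section: the Ps/CB log-ratio within each condition, and the difference of those ratios between si-APC and si-control. The contrast names and the simulated expression matrix `E` are hypothetical.

```r
library(limma)

# Same group layout as the design matrix printed above.
celltype <- factor(rep(c("APC_CB", "APC_Ps", "Con_CB", "Con_Ps"), each = 4))
design <- model.matrix(~ 0 + celltype)

# Assumed contrasts (not taken from the report).
cm <- makeContrasts(
  PsCB_in_Con = celltypeCon_Ps - celltypeCon_CB,
  PsCB_in_APC = celltypeAPC_Ps - celltypeAPC_CB,
  APC_effect  = (celltypeAPC_Ps - celltypeAPC_CB) -
                (celltypeCon_Ps - celltypeCon_CB),
  levels = design
)

set.seed(3)
# Simulated log-expression values standing in for the voom output.
E <- matrix(rnorm(200 * 16), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("S", 1:16)))

fit  <- lmFit(E, design)                 # one coefficient per group
fit2 <- eBayes(contrasts.fit(fit, cm))   # moderated t-statistics per contrast
print(topTable(fit2, coef = "APC_effect", number = 5))
```

In an actual run the voom object would be passed to `lmFit` in place of `E`, so that the precision weights are used.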

Volcano Plot

## [1] 11105    16

Heatmap: Top genes

Provenance:

## R version 3.2.1 (2015-06-18)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: OS X 10.10.5 (Yosemite)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  grid      stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] reshape2_1.4.1       d3heatmap_0.6.1.1    plotly_3.6.0        
##  [4] amap_0.8-14          rglwidget_0.1.1434   HTSFilter_1.8.0     
##  [7] Biobase_2.30.0       BiocGenerics_0.16.1  IDPmisc_1.1.17      
## [10] lattice_0.20-33      reshape_0.8.5        knitr_1.13          
## [13] rgl_0.95.1441        ggplot2_2.1.0        edgeR_3.12.1        
## [16] limma_3.26.9         Rsubread_1.20.6      BiocInstaller_1.20.3
## 
## loaded via a namespace (and not attached):
##  [1] viridis_0.3.4              httr_1.2.1                
##  [3] tidyr_0.5.1                jsonlite_1.0              
##  [5] splines_3.2.1              assertthat_0.1.0.99       
##  [7] Formula_1.2-1              shiny_0.13.2              
##  [9] stats4_3.2.1               latticeExtra_0.6-28       
## [11] yaml_2.1.13                RSQLite_1.0.0             
## [13] chron_2.3-47               digest_0.6.9              
## [15] GenomicRanges_1.22.4       RColorBrewer_1.1-2        
## [17] XVector_0.10.0             colorspace_1.2-6          
## [19] htmltools_0.3.5            httpuv_1.3.3              
## [21] Matrix_1.2-6               plyr_1.8.4                
## [23] DESeq2_1.10.1              XML_3.98-1.4              
## [25] genefilter_1.52.1          zlibbioc_1.16.0           
## [27] xtable_1.8-2               scales_0.4.0              
## [29] BiocParallel_1.4.3         tibble_1.1                
## [31] annotate_1.48.0            IRanges_2.4.8             
## [33] DT_0.1                     SummarizedExperiment_1.0.2
## [35] nnet_7.3-12                survival_2.39-5           
## [37] magrittr_1.5               mime_0.5                  
## [39] evaluate_0.9               RcppArmadillo_0.7.200.2.0 
## [41] foreign_0.8-66             tools_3.2.1               
## [43] data.table_1.9.6           formatR_1.4               
## [45] stringr_1.0.0              S4Vectors_0.8.11          
## [47] munsell_0.4.3              locfit_1.5-9.1            
## [49] cluster_2.0.4              AnnotationDbi_1.32.3      
## [51] lambda.r_1.1.9             DESeq_1.20.0              
## [53] GenomeInfoDb_1.6.3         futile.logger_1.4.3       
## [55] htmlwidgets_0.6            base64enc_0.1-3           
## [57] labeling_0.3               rmarkdown_1.0             
## [59] gtable_0.2.0               DBI_0.4-1                 
## [61] markdown_0.7.7             R6_2.1.2                  
## [63] gridExtra_2.2.1            Hmisc_3.17-4              
## [65] futile.options_1.0.0       stringi_1.1.1             
## [67] Rcpp_0.12.6                png_0.1-7                 
## [69] geneplotter_1.48.0         rpart_4.1-10              
## [71] acepack_1.3-3.3